Optimal ablation for interpretability

Neural Information Processing Systems

Interpretability studies often involve tracing the flow of information through machine learning models to identify specific model components that perform relevant computations for tasks of interest.
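The basic operation behind this line of work, ablating a candidate component and measuring the effect on the model's output, can be sketched in a few lines. This is a toy NumPy setup of our own, not the paper's optimal-ablation method; mean ablation is used here as the baseline intervention:

```python
import numpy as np

def mlp_layer(x, W, b):
    """A toy 'model component': one dense layer with ReLU."""
    return np.maximum(0.0, x @ W + b)

def forward(x, params, ablate=False, mean_act=None):
    """Two-layer toy model; optionally mean-ablate the hidden layer."""
    h = mlp_layer(x, params["W1"], params["b1"])
    if ablate:
        # Replace the component's activations with their dataset mean,
        # severing the information it carries while keeping scale plausible.
        h = np.broadcast_to(mean_act, h.shape)
    return h @ params["W2"] + params["b2"]

rng = np.random.default_rng(0)
params = {
    "W1": rng.normal(size=(4, 8)), "b1": np.zeros(8),
    "W2": rng.normal(size=(8, 1)), "b2": np.zeros(1),
}
X = rng.normal(size=(32, 4))
mean_act = mlp_layer(X, params["W1"], params["b1"]).mean(axis=0)

clean = forward(X, params)
ablated = forward(X, params, ablate=True, mean_act=mean_act)
effect = np.abs(clean - ablated).mean()  # component's causal contribution
print(round(float(effect), 3))
```

Mean ablation is chosen over zero ablation here because zeroing activations can push the network far off its training distribution; which replacement value is "optimal" is exactly the question papers in this area study.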


On the creation of narrow AI: hierarchy and nonlocality of neural network skills

Michaud, Eric J., Parker-Sartori, Asher, Tegmark, Max

arXiv.org Artificial Intelligence

We study the problem of creating strong, yet narrow, AI systems. While recent AI progress has been driven by the training of large general-purpose foundation models, the creation of smaller models specialized for narrow domains could be valuable for both efficiency and safety. In this work, we explore two challenges in creating such systems, both stemming from basic properties of how neural networks learn and structure their representations. The first challenge concerns when it is possible to train narrow models from scratch. Through experiments on a synthetic task, we find that it is sometimes necessary to train networks on a wide distribution of data to learn certain narrow skills within that distribution. This effect arises when skills depend on each other hierarchically, and training on a broad distribution introduces a curriculum which substantially accelerates learning. The second challenge concerns how to transfer particular skills from large general models into small specialized models. We find that model skills are often not perfectly localized to a particular set of prunable components. However, we find that methods based on pruning can still outperform distillation. We investigate the use of a regularization objective to align desired skills with prunable components while unlearning unnecessary skills.
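The pruning-based transfer mentioned above can be illustrated with a minimal saliency-pruning sketch. This is our own illustration; the first-order Taylor score |w · dL/dw| and the `keep_frac` parameter are assumptions, not the paper's exact criterion:

```python
import numpy as np

def prune_by_saliency(W, grad, keep_frac=0.2):
    """Keep only the weights most relevant to a target skill.

    Saliency |w * dL/dw| ranks each weight by (a first-order estimate of)
    how much the narrow-task loss would change if it were removed.
    """
    saliency = np.abs(W * grad)
    k = int(saliency.size * keep_frac)
    threshold = np.partition(saliency.ravel(), -k)[-k]  # k-th largest score
    mask = saliency >= threshold
    return W * mask, mask

rng = np.random.default_rng(1)
W = rng.normal(size=(16, 16))
grad = rng.normal(size=(16, 16))  # stand-in for narrow-task gradients
W_pruned, mask = prune_by_saliency(W, grad, keep_frac=0.2)
print(mask.mean())  # fraction of weights kept, ~0.2
```

A regularizer like the one the abstract mentions would be added during training to concentrate the target skill's saliency into a small, cleanly prunable set of components, precisely because skills are otherwise not localized this way.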


Extending Load Forecasting from Zonal Aggregates to Individual Nodes for Transmission System Operators

Triebe, Oskar, Passow, Fletcher, Wittner, Simon, Wagner, Leonie, Arend, Julio, Sun, Tao, Zanocco, Chad, Miltner, Marek, Ghesmati, Arezou, Tsai, Chen-Hao, Bergmeir, Christoph, Rajagopal, Ram

arXiv.org Artificial Intelligence

The reliability of local power grid infrastructure is challenged by sustainable energy developments increasing electric load uncertainty. Transmission System Operators (TSOs) need load forecasts of higher spatial resolution, extending current forecasting operations from zonal aggregates to individual nodes. However, nodal loads are less accurate to forecast and require a large number of individual forecasts, which are hard to manage for the human experts assessing risks in the control room's daily operations (operators). In collaboration with a TSO, we design a multi-level system that meets the needs of operators for hourly day-ahead load forecasting. Utilizing a uniquely extensive dataset of zonal and nodal net loads, we experimentally evaluate our system components. First, we develop an interpretable and scalable forecasting model that allows TSOs to gradually extend zonal operations to include nodal forecasts. Second, we evaluate solutions to address the heterogeneity and volatility of nodal load, subject to a trade-off. Third, our system is manageable with a fully parallelized single-model forecasting workflow. Our results show accuracy and interpretability improvements for zonal forecasts, and substantial improvements for nodal forecasts.

Keywords: Short-Term Load Forecast, Transmission System Operator, Global Forecasting Model, Hierarchical Forecasting, Distributed Energy Resources, Electrical Power Grid

1. Introduction

Electric transmission system operators (TSOs) face increasing volatility in electric load due to distributed and renewable energy generation, climate events, and electrification [1]. This volatility complicates load forecasting, which is essential to TSO operations. TSOs must ensure that electricity generation matches load at all times, and that the distribution of power across their territory does not overwhelm any infrastructure component. To accomplish this, they use day-ahead load forecasts to inform where to dispatch generators each hour of the coming day. Growing electrification and distributed generation increase the volatility of 'net load' (local consumption minus generation) in some places and not others, as adoption of these technologies proceeds unevenly. This could put a TSO's medium-voltage grid components, for example sub-transmission lines and primary distribution substations, at risk of damage if load forecasts miss unexpected local changes [2, 3, 4].


Socially inspired Adaptive Coalition and Client Selection in Federated Learning

Licciardi, Alessandro, Raineri, Roberta, Proskurnikov, Anton, Rondoni, Lamberto, Zino, Lorenzo

arXiv.org Artificial Intelligence

Federated Learning (FL) enables privacy-preserving collaborative model training, but its effectiveness is often limited by client data heterogeneity. We introduce a client-selection algorithm that (i) dynamically forms nonoverlapping coalitions of clients based on asymptotic agreement and (ii) selects one representative from each coalition to minimize the variance of model updates. Our approach is inspired by social-network modeling, leveraging homophily-based proximity matrices for spectral clustering and techniques for identifying the most informative individuals to estimate a group's aggregate opinion. We provide theoretical convergence guarantees for the algorithm under mild, standard FL assumptions. Finally, we validate our approach by benchmarking it against three strong heterogeneity-aware baselines; the results show higher accuracy and faster convergence, indicating that the framework is both theoretically grounded and effective in practice.
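The two steps, coalition formation by spectral clustering and representative selection, can be sketched under simplifying assumptions: two coalitions only, and a hand-built block similarity matrix standing in for the paper's homophily-based proximity matrix:

```python
import numpy as np

def two_coalitions(S):
    """Split clients into two coalitions via the Fiedler vector of the
    graph Laplacian (a minimal spectral-clustering sketch)."""
    L = np.diag(S.sum(axis=1)) - S          # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                    # eigenvector of 2nd-smallest eigenvalue
    return fiedler >= 0                     # boolean coalition labels

def pick_representative(updates, members):
    """Choose the member whose update is closest to the coalition mean,
    a simple proxy for minimizing the variance of aggregated updates."""
    idx = np.flatnonzero(members)
    mean = updates[idx].mean(axis=0)
    return idx[np.argmin(np.linalg.norm(updates[idx] - mean, axis=1))]

# Two well-separated client groups (block-structured similarity).
rng = np.random.default_rng(0)
S = np.block([[np.full((3, 3), 0.9), np.full((3, 3), 0.1)],
              [np.full((3, 3), 0.1), np.full((3, 3), 0.9)]])
np.fill_diagonal(S, 0.0)
labels = two_coalitions(S)
updates = rng.normal(size=(6, 5)) + labels[:, None] * 5.0  # heterogeneous updates
reps = [pick_representative(updates, labels == c) for c in (False, True)]
print(sorted(labels.tolist()), reps)
```

The full algorithm generalizes this to k coalitions formed dynamically as training proceeds; the eigenvalue gap of the Laplacian is the usual heuristic for choosing k.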


Towards Automated Circuit Discovery for Mechanistic Interpretability

Neural Information Processing Systems

Through considerable effort and intuition, several recent works have reverse-engineered nontrivial behaviors of transformer models. This paper systematizes the mechanistic interpretability process they followed. First, researchers choose a metric and dataset that elicit the desired model behavior.


ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior

Eichin, Florian, Du, Yupei, Mondorf, Philipp, Matveev, Maria, Plank, Barbara, Hedderich, Michael A.

arXiv.org Artificial Intelligence

Post-hoc interpretability methods typically attribute a model's behavior to its components, data, or training trajectory in isolation. This leads to explanations that lack a unified view and may miss key interactions. While combining existing methods or applying them at different training stages offers broader insights, such approaches usually lack theoretical support. In this work, we present ExPLAIND, a unified framework that integrates all these perspectives. First, we generalize recent work on gradient path kernels, which reformulate models trained by gradient descent as a kernel machine, to realistic settings like AdamW. We empirically validate that a CNN and a Transformer are accurately replicated by this reformulation. Second, we derive novel parameter- and step-wise influence scores from the kernel feature maps. Their effectiveness for parameter pruning is comparable to existing methods, demonstrating their value for model component attribution. Finally, jointly interpreting model components and data over the training process, we leverage ExPLAIND to analyze a Transformer that exhibits Grokking. Our findings support previously proposed stages of Grokking, while refining the final phase as one of alignment of input embeddings and final layers around a representation pipeline learned after the memorization phase. Overall, ExPLAIND provides a theoretically grounded, unified framework to interpret model behavior and training dynamics.
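The gradient path kernel reformulation the authors generalize can be stated, in the simplified continuous-time (gradient-flow) limit rather than the paper's AdamW setting, as follows:

```latex
% Gradient flow with per-example losses L(y_i, f_w(x_i)); L'_i(t) denotes
% the derivative of the loss in its second argument at time t.
\frac{dw}{dt} = -\sum_i L'_i(t)\, \nabla_w f_w(x_i)
% By the chain rule, the prediction on any input x evolves through the
% time-varying tangent kernel K_t(x, x') = \nabla_w f_w(x) \cdot \nabla_w f_w(x'):
\frac{d}{dt} f_{w(t)}(x)
  = \nabla_w f_w(x) \cdot \frac{dw}{dt}
  = -\sum_i L'_i(t)\, K_t(x, x_i)
% Integrating over the training trajectory gives the kernel-machine form:
f_{w(T)}(x) = f_{w(0)}(x) - \sum_i \int_0^T L'_i(t)\, K_t(x, x_i)\, dt
```

The trained model is thus a sum of path-kernel contributions from individual training examples and steps, which is what makes the parameter- and step-wise influence scores derivable from the kernel feature maps.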


Interpreting Language Models Through Concept Descriptions: A Survey

Feldhus, Nils, Kopf, Laura

arXiv.org Artificial Intelligence

Understanding the decision-making processes of neural networks is a central goal of mechanistic interpretability. In the context of Large Language Models (LLMs), this involves uncovering the underlying mechanisms and identifying the roles of individual model components such as neurons and attention heads, as well as model abstractions such as the learned sparse features extracted by Sparse Autoencoders (SAEs). A rapidly growing line of work tackles this challenge by using powerful generator models to produce open-vocabulary, natural language concept descriptions for these components. In this paper, we provide the first survey of the emerging field of concept descriptions for model components and abstractions. We chart the key methods for generating these descriptions, the evolving landscape of automated and human metrics for evaluating them, and the datasets that underpin this research. Our synthesis reveals a growing demand for more rigorous, causal evaluation. By outlining the state of the art and identifying key challenges, this survey provides a roadmap for future research toward making models more transparent.